We consider a mobile user accessing contents in a dynamic environment, where new contents are generated over time (by the user's contacts) and remain relevant to the user for random lifetimes. The user, equipped with a finite-capacity cache memory, accesses the system at random times and requests all the contents that are relevant at the time of access. The system incurs an energy cost that depends on the number of contents downloaded and the channel quality at that time. Assuming causal knowledge of the channel quality, the content profile, and the user-access behavior, we model the proactive caching problem as a Markov decision process with the goal of minimizing the long-term average energy cost. We first prove the optimality of a threshold-based proactive caching scheme, which, depending on the channel state, dynamically caches or removes appropriate contents from the memory before they are requested by the user. The optimal threshold values depend on the system state, and are therefore computationally intractable to obtain. We thus propose parametric representations for the threshold values, and use reinforcement-learning algorithms to find near-optimal parametrizations. We demonstrate through simulations that the proposed schemes significantly outperform classical reactive downloading, and perform very close to a genie-aided lower bound.